social network analysis
Taggus: An Automated Pipeline for the Extraction of Characters' Social Networks from Portuguese Fiction Literature
Canário, Tiago G, Duarte, Catarina, Pinheiro, Flávio L., Pereira, João L. M.
Automatically identifying characters and their interactions from fiction books is, arguably, a complex task that requires pipelines that leverage multiple Natural Language Processing (NLP) methods, such as Named Entity Recognition (NER) and Part-of-speech (POS) tagging. However, these methods are not optimized for the task that leads to the construction of Social Networks of Characters. Indeed, the currently available methods tend to underperform, especially in less-represented languages, due to a lack of manually annotated data for training. Here, we propose a pipeline, which we call Taggus, to extract social networks from literary fiction works in Portuguese. Our results show that compared to readily available State-of-the-Art tools -- off-the-shelf NER tools and Large Language Models (ChatGPT) -- the resulting pipeline, which uses POS tagging and a combination of heuristics, achieves satisfying results with an average F1-Score of $94.1\%$ in the task of identifying characters and solving for co-reference and $75.9\%$ in interaction detection. These represent, respectively, an increase of $50.7\%$ and $22.3\%$ on results achieved by the readily available State-of-the-Art tools. Further steps to improve results are outlined, such as solutions for detecting relationships between characters. Limitations on the size and scope of our testing samples are acknowledged. The Taggus pipeline is publicly available to encourage development in this field for the Portuguese language.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Europe > Portugal > Castelo Branco > Castelo Branco (0.04)
- (3 more...)
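The F1-scores quoted in the Taggus abstract above combine precision and recall into a single number. The following is a minimal sketch of that calculation, assuming characters are compared as plain sets of names. The abstract does not describe the evaluation protocol, so the set-based matching, the function name `f1_score`, and the example names are all hypothetical.

```python
from __future__ import annotations


def f1_score(predicted: set[str], gold: set[str]) -> float:
    """Set-based F1: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical character lists for illustration only, not data from the paper.
gold_characters = {"Ana", "Bruno", "Carla", "Duarte"}
predicted_characters = {"Ana", "Bruno", "Carla", "Lisboa"}

score = f1_score(predicted_characters, gold_characters)
print(f"F1 = {score:.3f}")
```

Here three of the four predicted names are correct and three of the four gold names are found, so precision and recall are both 0.75 and so is their harmonic mean.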
YT-30M: A multi-lingual multi-category dataset of YouTube comments
This paper introduces two large-scale multilingual comment datasets from YouTube, YT-30M and YT-100K. The analysis in this paper is performed on YT-100K, a smaller sample of YT-30M. Both datasets, YT-30M (full) and YT-100K (a randomly selected 100K sample from YT-30M), are publicly released for further research. YT-30M (YT-100K) contains 32,236,173 (108,694) comments posted by YouTube channels that belong to YouTube categories. Each comment is associated with a video ID, comment ID, commentor name, commentor channel ID, comment text, upvotes, original channel ID and category of the YouTube channel (e.g., 'News & Politics', 'Science & Technology', etc.).
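The per-comment fields listed above map naturally onto a small record type. The sketch below is one possible in-memory representation. The field names are assumptions derived from the abstract's wording and are not the column names of the released files, and the sample records are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class YouTubeComment:
    # Field names are hypothetical; they mirror the attributes named in the abstract.
    video_id: str
    comment_id: str
    commentor_name: str
    commentor_channel_id: str
    comment_text: str
    upvotes: int
    original_channel_id: str
    category: str


def comments_per_category(comments):
    """Count how many comments fall under each channel category."""
    return Counter(comment.category for comment in comments)


# Invented sample records for illustration only.
sample = [
    YouTubeComment("v1", "c1", "user_a", "ch_a", "Great video", 3, "ch_x", "News & Politics"),
    YouTubeComment("v1", "c2", "user_b", "ch_b", "Thanks", 0, "ch_x", "News & Politics"),
    YouTubeComment("v2", "c3", "user_c", "ch_c", "Interesting", 7, "ch_y", "Science & Technology"),
]

counts = comments_per_category(sample)
print(counts)
```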
Chronological Analysis of Rigvedic Mandalas using Social Networks
Prabhu, Shreekanth M, Radhakrishnan, Gopalpillai
Establishing the chronology of the Vedas has interested scholars for the last two centuries. The oldest among them is Rig-Veda which has ten Mandalas, each composed separately. In this paper, we look at deciphering plausible pointers to the internal chronology of the Mandalas, by focusing on Gods and Goddesses worshiped in different Mandalas. We apply text analysis to the Mandalas using Clustering Techniques based on Cosine Similarity. Then we represent the association of deities with Mandalas using a grid-based Social Network that is amenable to chronological analysis and demonstrates the benefits of using Social Network Analysis for the problem at hand. Further, we analyze references to rivers to arrive at additional correlations. The approach used can be deployed generically to analyze other kinds of references and mentions and arrive at more substantive inferences.
- Asia > India > Tamil Nadu > Chennai (0.05)
- Europe > Lithuania (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- (7 more...)
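The Rigvedic Mandalas abstract above mentions clustering based on cosine similarity. The sketch below shows the similarity measure itself, assuming each Mandala is represented by counts of deity mentions. The abstract does not specify the feature representation; the counts and the labels `mandala_a`, `mandala_b`, and `mandala_c` are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented mention counts; not taken from the paper.
mandala_a = Counter({"Indra": 1, "Agni": 1})
mandala_b = Counter({"Indra": 1})
mandala_c = Counter({"Soma": 5})

print(cosine_similarity(mandala_a, mandala_b))  # shares one deity
print(cosine_similarity(mandala_a, mandala_c))  # shares none
```

A clustering step would then group texts whose pairwise similarity is high; which clustering algorithm the authors used is not stated in the abstract.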
Adolescent relational behaviour and the obesity pandemic: A descriptive study applying social network analysis and machine learning techniques
Marqués-Sánchez, Pilar, Martínez-Fernández, María Cristina, Benítez-Andrades, José Alberto, Quiroga-Sánchez, Enedina, García-Ordás, María Teresa, Arias-Ramos, Natalia
Aim: To study the existence of subgroups by exploring the similarities between the attributes of the nodes of the groups in relation to diet and gender, and to analyse the connectivity between groups based on their similarities through SNA and artificial intelligence techniques. Methods: 235 students from 5 different educational centres participated in this study between March and December 2015. The data analysis was divided into two blocks: social network analysis and unsupervised machine learning techniques. For the social network analysis, the Girvan-Newman technique was applied to find the best number of cohesive groups within each of the friendship networks of the classes analysed. Results: After applying Girvan-Newman in the three classes, the best division into clusters was 2 for classroom A, 7 for classroom B and 6 for classroom C. There are significant differences between the groups and the gender and diet variables. After applying K-means using population diet as an input variable, a clustering of 2 clusters for class A, 3 clusters for class B and 3 clusters for class C was obtained. Conclusion: Adolescents form subgroups within their classrooms. Subgroup cohesion is defined by the fact that nodes share similarities in aspects that influence obesity: they share attributes related to food quality and gender. The concept of homophily, related to SNA, justifies our results. Artificial intelligence techniques together with the application of the Girvan-Newman technique provide robustness to the structural analysis of similarities and cohesion between subgroups.
- Europe > Spain > Castile and León > León Province > León (0.04)
- South America > Brazil > Goiás (0.04)
- Oceania > Australia > South Australia (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
- Information Technology > Services (0.84)
- Education > Educational Setting (0.68)
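The adolescent obesity study above relies on the Girvan-Newman technique, which splits a network by repeatedly removing the edge with the highest betweenness. The sketch below performs a single such split on a toy friendship graph using only the standard library. It is an illustration under stated assumptions, not the authors' implementation: the graph is invented, and the criterion the study used to choose the best number of groups is not given in the abstract and is left out.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs (no self-loops)."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def connected_components(graph):
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        components.append(component)
    return components


def edge_betweenness(graph):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    scores = {}
    for source in graph:
        # Forward pass: BFS counting shortest paths and recording predecessors.
        dist = {source: 0}
        sigma = {node: 0 for node in graph}
        sigma[source] = 1
        preds = {node: [] for node in graph}
        order, queue = [], deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
                if dist[neighbour] == dist[node] + 1:
                    sigma[neighbour] += sigma[node]
                    preds[neighbour].append(node)
        # Backward pass: accumulate path shares from the farthest nodes inward.
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in preds[node]:
                share = sigma[pred] / sigma[node] * (1 + delta[node])
                edge = frozenset((pred, node))
                scores[edge] = scores.get(edge, 0.0) + share
                delta[pred] += share
    return scores


def girvan_newman_step(graph):
    """Remove highest-betweenness edges until the component count increases."""
    graph = {node: set(neighbours) for node, neighbours in graph.items()}
    initial = len(connected_components(graph))
    while len(connected_components(graph)) == initial:
        scores = edge_betweenness(graph)
        if not scores:  # no edges left to remove
            break
        u, v = max(scores, key=scores.get)
        graph[u].discard(v)
        graph[v].discard(u)
    return connected_components(graph)


# Invented friendship ties: two triangles joined by one bridge (c-d).
friendships = build_graph([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("c", "d"),
    ("d", "e"), ("e", "f"), ("d", "f"),
])
groups = girvan_newman_step(friendships)
print(sorted(sorted(group) for group in groups))
```

In this toy graph every shortest path between the two triangles crosses the bridge, so that edge has the highest betweenness and is removed first, leaving two groups of three.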
An Ontology-Based multi-domain model in Social Network Analysis: Experimental validation and case study
Benítez-Andrades, José Alberto, García-Rodríguez, Isaías, Benavides, Carmen, Aláiz-Moretón, Héctor, Gayo, José Emilio Labra
Social network theory and methods of analysis have been applied to different domains in recent years, including public health. The complete procedure for carrying out a social network analysis (SNA) is a time-consuming task that entails a series of steps in which the expert in social network analysis could make mistakes. This research presents a multi-domain knowledge model capable of automatically gathering data and carrying out different social network analyses in different domains, without errors and obtaining the same conclusions that an expert in SNA would obtain. The model is represented in an ontology called OntoSNAQA, which is made up of classes, properties and rules representing the domains of People, Questionnaires and Social Network Analysis. Besides the ontology itself, different rules are represented by SWRL and SPARQL queries. A Knowledge Based System was created using OntoSNAQA and applied to a real case study in order to show the advantages of the approach. Finally, the results of an SNA obtained through the model were compared to those obtained from some of the most widely used SNA applications: UCINET, Pajek, Cytoscape and Gephi, to test and confirm the validity of the model.
- Europe > Spain > Castile and León > León Province > León (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > England (0.04)
- (3 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (0.72)
- Information Technology > Services (1.00)
- Health & Medicine > Consumer Health (1.00)
- Education > Educational Setting (1.00)
- (3 more...)
Creating a Systematic ESG (Environmental Social Governance) Scoring System Using Social Network Analysis and Machine Learning for More Sustainable Company Practices
Environmental Social Governance (ESG) is a widely used metric that measures the sustainability of a company's practices. Currently, ESG is determined using self-reported corporate filings, which allows companies to portray themselves in an artificially positive light. As a result, ESG evaluation is subjective and inconsistent across raters, giving executives mixed signals on what to improve. This project aims to create a data-driven ESG evaluation system that can provide better guidance and more systemized scores by incorporating social sentiment. Social sentiment allows for more balanced perspectives which directly highlight public opinion, helping companies create more focused and impactful initiatives. To build this, Python web scrapers were developed to collect data from Wikipedia, Twitter, LinkedIn, and Google News for the S&P 500 companies. Data was then cleaned and passed through NLP algorithms to obtain sentiment scores for ESG subcategories. Using these features, machine-learning algorithms were trained and calibrated to S&P Global ESG Ratings to test their predictive capabilities. The Random-Forest model was the strongest model with a mean absolute error of 13.4% and a correlation of 26.1% (p-value 0.0372), showing encouraging results. Overall, measuring ESG social sentiment across sub-categories can help executives focus efforts on areas people care about most. Furthermore, this data-driven methodology can provide ratings for companies without coverage, allowing more socially responsible firms to thrive.
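The two evaluation figures quoted above, mean absolute error and correlation, can be computed as in the sketch below. The abstract does not say which correlation coefficient was used, so Pearson's is assumed here, and the rating values are invented for illustration.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute gap between reference and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_correlation(x, y):
    """Pearson's r; assumed here, since the abstract does not name the coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Invented ratings on a 0-100 scale; not data from the project.
reference_ratings = [50.0, 60.0, 70.0, 80.0]
predicted_ratings = [55.0, 58.0, 75.0, 70.0]

mae = mean_absolute_error(reference_ratings, predicted_ratings)
r = pearson_correlation(reference_ratings, predicted_ratings)
print(f"MAE = {mae:.1f}, r = {r:.3f}")
```

The two measures answer different questions: MAE reports how far predictions are from the reference ratings on average, while the correlation reports whether predictions rise and fall with them.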
The use of new technologies to support Public Administration. Sentiment analysis and the case of the app IO
Miracula, Vincenzo, Picone, Antonio
Since 2005, there has been an increasing development of digitization within the public administration that sees the introduction of the use of technology as a privileged tool in the management of administrative activities. The main objective is to promote digitization in administrations in order to achieve greater efficiency in their activities in internal relations, between different administrations, and between the latter and private individuals. The entry of artificial intelligence into public action, however, needs to be accompanied by an adequate regulatory framework to guarantee the rights of those administered. The notion of digital transformation has gained significant attention in the literature[1]. Although approaches to the definition of digital transformation vary[2], most authors suggest that digital transformation involves the use of ICT technology to create fundamentally new capabilities in business, public administration[3] and people's lives[4].
- Asia > China > Liaoning Province > Fushun (0.05)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > Italy (0.04)
- Law (1.00)
- Government (1.00)
The impact of Twitter on political influence on the choice of a running mate: Social Network Analysis and Semantic Analysis -- A Review
Wanza, Immaculate, Kamuti, Irad, Gichohi, David, Gikunda, Kinyua
In this new era of social media, social networks are becoming increasingly important sources of user-generated content on the internet. These kinds of information resources, which include a lot of people's feelings, opinions, feedback, and reviews, are very useful for big businesses, markets, politics, journalism, and many other fields. Politics is one of the most talked-about and popular topics on social media networks right now. Many politicians use micro-blogging services like Twitter because they have a large number of followers and supporters on those networks. Politicians, political parties, political organizations, and foundations use social media networks to communicate with citizens ahead of time. Today, social media is used by hundreds of thousands of political groups and politicians. On these social media networks, every politician and political party has millions of followers, and politicians find new and innovative ways to urge individuals to participate in politics. Furthermore, social media assists politicians in various decision-making processes by providing recommendations, such as developing policies and strategies based on previous experiences, recommending and selecting suitable candidates for a particular constituency, recommending a suitable person for a particular position in the party, and launching a political campaign based on citizen sentiments on various issues and controversies, among other things. This research is a review of the use of social network analysis (SNA) and semantic analysis (SA) on the Twitter platform to study the supporters' networks of political leaders, because it can help in decision-making when predicting their political futures.
- Asia > Indonesia (0.05)
- South America > Colombia (0.04)
- Asia > Thailand (0.04)
- (3 more...)
- Information Technology > Services (1.00)
- Government (1.00)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.33)
Using artificial intelligence to manage extreme weather events
McGill study aims to make social media contributions more useful to crisis managers
Can combining deep learning (DL), a subfield of artificial intelligence, with social network analysis (SNA) make social media contributions about extreme weather events a useful tool for crisis managers, first responders and government scientists? An interdisciplinary team of McGill researchers has brought these tools to the forefront in an effort to understand and manage extreme weather events. The researchers found that by using a noise reduction mechanism, valuable information could be filtered from social media to better assess trouble spots and users’ reactions vis-à-vis extreme weather events. The results of the study are published in the Journal of Contingencies and Crisis Management.
Diving into a sea of information
“We reduced the noise by finding out who was being listened to, and which were authoritative sources,” explains Renee Sieber, Associate Professor in McGill’s Department of Geography and lead author of this study. “This ability is important because it is quite difficult to assess the validity of the information shared by Twitter users.” The team based their study on Twitter data from the March 2019 Nebraska floods in the United States, which caused over $1 billion in damage and widespread evacuations of residents. In total, over 1,200 tweets were analyzed and classified. “Social network analysis can identify where people get their information during an extreme weather event. Deep learning allows us to better understand the content of this information by classifying thousands of tweets into fixed categories, for example, ‘infrastructure and utilities damage’ or ‘sympathy and emotional support’,” says Sieber. The researchers then introduced a two-tiered DL classification model, a first in terms of integrating these methods in a way that could be useful to crisis managers.
The study highlighted some issues regarding the use of social media analysis for this purpose, notably its failure to note that events are far more contextual than expected by labelled datasets, such as the CrisisNLP, and the lack of a universal language to categorize terms related to crisis management. The preliminary exploration performed by the researchers also found that a celebrity call-out featured prominently. This was indeed the case for the 2019 Nebraska floods, where a tweet from pop singer Justin Timberlake was shared by a large number of users, though it did not prove to be of use for crisis managers. “Our findings tell us that information content varies between different types of events, contrary to the belief that there is a universal language to categorize crisis management; this limits the use of labelled datasets on just a few types of events, as search terms may change from one event to another.” “The vast amount of social media data the public contributes about weather suggests it can provide critical information in crises, such as snowstorms, floods, and ice storms. We are currently exploring transferring this model to different types of weather crises and addressing the shortcomings of existing supervised approaches by combining these with other methods,” says Sieber.
About this study
“Using deep learning and social network analysis to understand and manage extreme flooding” by Renee Sieber et al. was published in the Journal of Contingencies and Crisis Management. This study was funded by Environment Canada.
About McGill University
Founded in Montreal, Quebec, in 1821, McGill University is Canada’s top ranked medical doctoral university. McGill is consistently ranked as one of the top universities, both nationally and internationally. It is a world-renowned institution of higher learning with research activities spanning two campuses, 11 faculties, 13 professional schools, 300 programs of study and over 40,000 students, including more than 10,200 graduate students. McGill attracts students from over 150 countries around the world, its 12,800 international students making up 31 per cent of the student body. Over half of McGill students claim a first language other than English, including approximately 19% of our students who say French is their mother tongue.
- North America > Canada > Quebec > Montreal (0.78)
- North America > United States > Nebraska (0.47)
- Information Technology > Services (1.00)
- Education > Educational Setting > Higher Education (0.57)
Social Networks Analysis to Retrieve Critical Comments on Online Platforms
Social networks are a rich source of data for analyzing user habits in all aspects of life. User behavior is a decisive component of the health system in various countries, and promoting good behavior can improve public health significantly. In this work, we develop a new model for social network analysis using a text analysis approach. We characterize each user's reaction to the global pandemic by analyzing their online behavior. Clustering online users with similar habits helps to reveal how the virus spreads in different societies. Promoting a healthy lifestyle among high-risk social media users has a significant effect on public health and on reducing the impact of the global pandemic. We introduce a new approach to clustering habits based on user activity on social media during the pandemic, and we recommend a machine learning model to promote health on online platforms.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Hubei Province > Wuhan (0.04)